Detection of Wideband Interference

ABSTRACT

A method of detecting interference in a received sample vector using hidden Markov modelling by first estimating noise variance, where estimating noise variance comprises the steps of receiving a sample vector of noise and interference, sorting the sample vector in the frequency domain by order of increasing magnitude to produce an ordered vector, finding a sub-vector of the ordered vector that minimises the distance from a noise measure, and estimating the noise variance.

FIELD OF INVENTION

The present invention is a method and system of estimating noise power and detecting interference with particular application to cognitive radio systems.

BACKGROUND

Radio spectrum is typically allocated by Government organisations. Portions of the radio spectrum are licensed to users for particular use. For example AM and FM radio bands are licensed to different radio service providers. Unlicensed users of spectrum are generally tracked down and stopped from using licensed portions of the spectrum. Some areas of the spectrum are unlicensed and provide a free for all for any number of users. Examples of unlicensed spectrum include the spectrum in which baby monitors, garage door openers and cordless telephones operate; this is the 2.4 GHz ISM band.

One drawback of licensing radio spectrum is that the licensed spectrum is used inefficiently. Under a spectrum license granted to a radio station, only that radio station can use its licensed portion of spectrum. It may be possible for other users to use the same spectrum without causing interference to the licensed party, but this is not allowed under the current licensing terms.

The commercial success of unlicensed spectrum is leading to developments in which many wireless users can operate in the same frequency band and share the spectrum. Some existing systems to allow this include dynamic frequency selection and transmit power control. These new concepts have led to a new smart wireless system called cognitive radio. An important aspect of cognitive radio behaviour is the ability to reliably detect the presence of interference from other systems competing for spectrum use. This is necessary both in the cognitive radio receiver, to protect the integrity of the wanted signal, and in the cognitive radio transmitter, which must avoid producing interference on other systems. At the receiver, robust interference detection allows the system to identify which channels or bands should be avoided. Further, where the system in question is broadband and the interference is relatively narrow, it is possible for the system to co-exist in the same spectrum as the interfering system with minimal disruption to either system.

Interference can be either wideband or narrowband. Detection of narrowband interference is based on well established principles including maximum likelihood estimation, exploiting the inherent cyclostationarity of narrowband signals. However, as more wideband communication systems are developed and standardised, interference increasingly occurs between wideband systems. In this interference the spectral use of one system may fully or partially overlap the spectral use of another system. While some wideband modulations such as orthogonal frequency division multiplexing exhibit cyclostationarity, wideband systems are not inherently cyclostationary. Wideband interference detection must be based primarily on received power spectrum density. Moreover, where wideband interference occurs between heterogeneous systems, each cognitive radio receiver must either view the received signal of the others as being generated by an unknown stochastic process or else incorporate a full physical layer receiver for each interfering system. Some existing systems use an FFT to determine the power in each frequency bin. If the power is more than a threshold value, the detector determines that there is interference. This method is a simple interference temperature method. Another proposed system is to use a control channel common to all users of the spectrum. The control channel will provide information on the spectral usage of each user. However, such a system relies on different manufacturers all agreeing to the same protocols and frequency for the control channel. The control channel system avoids finding and suppressing interference. In the worst case, the system is completely blind to any interferers; the interferers are wideband with unknown frequency and unknown statistics.

SUMMARY OF INVENTION

It is the object of the present invention to provide an alternative method of estimating noise variance and method of detecting wideband interference, or to at least provide the public with a useful choice.

In broad terms in one aspect the invention comprises a method of estimating noise variance comprising the steps of receiving a sample of noise and interference, sorting the sample in the frequency domain by order of increasing magnitude to produce an ordered vector, finding a sub-vector of the ordered vector that minimises the distance from a noise measure, and estimating the noise variance.

Preferably the sub-vector is found as

$\hat{\overset{\rightarrow}{X}}_{Z}\overset{\Delta}{=}\left\lbrack {\overset{\rightarrow}{X}}_{1}\mspace{14mu} \ldots \mspace{14mu} {\overset{\rightarrow}{X}}_{N_{Z}} \right\rbrack$

where $\hat{\overset{\rightarrow}{X}}_{Z}$ is the sub-vector and ${\overset{\rightarrow}{X}}_{1} \ldots {\overset{\rightarrow}{X}}_{N_{Z}}$ are elements of the sub-vector.

In one embodiment the sub-vector is found as the vector that minimises the distance from Gaussianity, such that

${\hat{N}}_{z} = {\arg \; {\min\limits_{N_{z}}{D\left( {\overset{\rightarrow}{X}}_{N_{z}} \right)}}}$

where D() is some measure of Gaussianity. The measure of Gaussianity may by kurtosis or the Kolmogorov-Smirnov distance. If the Kolmogorov-Smirnov distance is used the real and imaginary parts of X can be evaluated from Gaussian, the distance of |X| from Rayleigh or the distance of. |X|² from exponential can be evaluated.

In broad terms in another aspect the invention comprises a method of detecting interference comprising the steps of receiving a sample vector X of noise and interference, finding a set of parameters that maximise p(X|T,P,I) where T is the transition probability between states, P is the observation probability and I is the initial state, determining a state sequence from the set of parameters and determining wideband interference from the state sequence.

In one embodiment the step of finding a set of parameters that maximise p(X|T,P,I) includes the step of estimating noise variance. Noise variance can be estimated using a number of techniques including receiving a sample vector of noise and interference, sorting the sample vector in the frequency domain by order of increasing magnitude to produce an ordered vector, finding a sub-vector of the ordered vector that minimises the distance from a noise measure, and estimating the noise variance.

The step of finding a set of parameters that maximise p(X|T,P,I) may further comprise the steps of producing a nominal interference variance and initialising a state sequence.

Preferably the nominal interference variance may be found as ${\hat{\sigma}}_{I}^{2}\overset{\Delta}{=}r{\hat{\sigma}}_{Z}^{2}$, where ${\hat{\sigma}}_{I}^{2}$ is the interference variance, r is a threshold value and ${\hat{\sigma}}_{Z}^{2}$ is the noise variance.

Preferably the optimal state sequence is found as

$ = {\arg \mspace{14mu} {\max\limits_{}\mspace{11mu} {{p\left( {\left. X \middle|  \right.,,,\mathcal{I}} \right)}{{p\left( {\left.  \middle|  \right.,,\mathcal{I}} \right)}.}}}}$

In one embodiment the optimal state sequence is produced using the Viterbi algorithm.

The term “comprising” as used in this specification and claims means “consisting at least in part of”. That is to say, when interpreting statements in this specification and claims which include “comprising”, the features prefaced by this term in each statement all need to be present but other features can also be present. Related terms such as “comprise” and “comprised” are to be interpreted in a similar manner.

BRIEF DESCRIPTION OF DRAWINGS

The invention will be further described by way of example only and without intending to be limiting with reference to the following drawings, wherein:

FIG. 1 shows a receiver system,

FIG. 2 shows an example of a received interference sample vector spectrum,

FIG. 3 shows an example of interference classification using the generalized likelihood ratio test on the example interference spectrum of FIG. 2,

FIG. 4 shows an example of interference classification using the method of the invention on the example interference spectrum of FIG. 2,

FIGS. 5A and 5B are graphs showing the detection performance of several different wideband detection methods,

FIGS. 6A and 6B are graphs showing the detection performance of several different wideband detection methods, and

FIG. 7 shows a receiver.

DETAILED DESCRIPTION

FIG. 1 shows a conventional receiver structure. The received signal is bandlimited to the approximate desired signal bandwidth B by band pass filter 1. The bandlimited signal is down-converted to baseband and complex samples are obtained at a rate f_(s)≧B sufficient to prevent aliasing of the desired signal. Initial detection of interferers takes place during a time-limited signal-free period such as an N sample inter-frame space. Thus, the receiver obtains a column vector of N signal samples

$\begin{matrix} {x = {{\sum\limits_{i = 1}^{I}{h_{i}s_{i}}} + z}} & 1 \end{matrix}$

where s_(i) is the N length vector representing the ith interferer experiencing N×N circulant fading channel matrix h_(i) and z is additive white Gaussian noise (AWGN) having variance σ_(Z) ² per sample. Application of the N×N discrete Fourier transform (DFT) matrix D to x produces

$\begin{matrix} {X = {{Dx} = {{\sum\limits_{i = 1}^{I}{H_{i}S_{i}}} + Z}}} & 2 \end{matrix}$

where H_(i) is the diagonal N×N channel transfer function matrix for the ith interferer, S_(i) is the N length DFT of s_(i) and Z is the DFT of AWGN vector z having both real part $Z^{R}\overset{\Delta}{=}{Re}\left\{ Z \right\}$ and imaginary part $Z^{I}\overset{\Delta}{=}{Im}\left\{ Z \right\}$ distributed according to the Gaussian probability density function (PDF)

$\begin{matrix} {{{p_{Z}\left( Z^{R} \right)}\overset{\Delta}{=}{{\frac{1}{\sqrt{2\pi}\sigma_{Z}}{\exp \left( {- {\frac{1}{2}\left\lbrack \frac{Z^{R}}{\sigma_{Z}} \right\rbrack}^{2}} \right)}}\overset{\Delta}{=}{\prod\limits_{n = 1}^{N}\; {\frac{1}{\sqrt{2\pi}\sigma_{Z}}{\exp \left( {- {\frac{1}{2}\left\lbrack \frac{Z_{n}^{R}}{\sigma_{Z}} \right\rbrack}^{2}} \right)}}}}},} & 3 \end{matrix}$

where σ_(Z) ² is the frequency domain dual of σ_(z) ².

The joint PDF of the real and imaginary parts of the noise is

$\begin{matrix} {{p_{Z}^{J}(Z)}\overset{\Delta}{=}{{p_{Z}\left( {Z^{R},Z^{I}} \right)} = {{{p_{Z}\left( Z^{R} \right)}{p_{Z}\left( Z^{I} \right)}} = {\frac{1}{2{\pi\sigma}_{Z}^{2}}{\exp \left( {- \frac{{Z}^{2}}{2\sigma_{Z}^{2}}} \right)}}}}} & 4 \end{matrix}$

and the noise magnitude PDF is

$\begin{matrix} {{P_{Z}^{M}\left( {Z} \right)} = {\frac{Z}{\sigma_{Z}^{2}}{{\exp \left( {- \frac{{Z}^{2}}{2\sigma_{Z}^{2}}} \right)}.}}} & 5 \end{matrix}$

The following four assumptions are made concerning the interference.

1. The duration of the received sample x is sufficiently short that all interference is present or absent in every time sample (vector element) and that H is not time-varying.

2. The ith interferer s_(i) is strictly bandlimited such that

$\begin{matrix} {{\langle\left| S_{i,n} \right|^{2}\rangle} = \left\{ {\begin{matrix} {\sigma_{i}^{2},} & {{n \in k_{i}}\overset{\Delta}{=}\left\lbrack {a_{i}\mspace{14mu} \ldots \mspace{14mu} b_{i}} \right\rbrack} \\ {0,} & {n \notin k_{i}} \end{matrix},} \right.} & 6 \end{matrix}$

where S_(i,n) is the nth element of the ith interferer S_(i), (b_(i)≧a_(i))ε[0 . . . N−1] and σ_(i) ²=L_(i) σ _(i) ²/N, for L_(i)=b_(i)−a_(i)+1, is the frequency domain dual of variance σ _(i) ², the power of the ith interferer as sampled in x.

3. There is no restriction on the overlap of interferers. Interferers may overlap fully, partially or not at all. The interference model includes narrowband interferers as well as wideband interferers.

4. The ith interferer S_(i) is white complex Gaussian distributed with zero mean and autocovariance $C_{S_{i}S_{i}} = \sigma_{i}^{2}I$, and s_(i) is complex Gaussian, bandlimited according to equation 6 with zero mean and autocovariance $C_{s_{i}s_{i}} = \langle s_{i}s_{i}^{H}\rangle = D\langle S_{i}S_{i}^{H}\rangle D^{H} = \sigma_{i}^{2}DW_{i}D^{H} = \sigma_{i}^{2}T_{i}$, where D is the DFT matrix, W_(i) is a diagonal window matrix such that

$\begin{matrix} {{W_{i} = {{diag}\left( \left\lbrack {W_{0}W_{1}\mspace{14mu} \ldots \mspace{14mu} W_{N - 1}} \right\rbrack \right)}},\mspace{14mu} {W_{n} = \left\{ {\begin{matrix} {1,} & {n \in k_{i}} \\ {0,} & {n \notin k_{i}} \end{matrix},} \right.}} & 7 \end{matrix}$

and T_(i) is a Toeplitz matrix where the first row is the circular sinc function having kth element

$\begin{matrix} {T_{i,{1\; k}} = {\frac{\sin \left( {\pi \; k\; {L_{i}/N}} \right)}{\sin \left( {\pi \; {k/N}} \right)}{{\exp \left( {{- j}\; \pi \; {{kb}_{i}/N}} \right)}.}}} & 8 \end{matrix}$

FIG. 2 shows the amplitude spectrum of a received sample vector of length N=100 produced using this model. The underlying sample vector was randomly generated and arbitrarily chosen. The dashed line in FIG. 2 shows the time average interference amplitude (root variance) assumed in each frequency bin. In this example, four interferers of varying bandwidths and mean power spectral densities were simulated. One interferer, present in bins 38 to 44, sits entirely within the spectrum of a second interferer. The mean noise amplitude (root variance) per frequency bin is indicated by the dotted line in FIG. 2. Note also that one interferer, present in bins 64 to 77, has a root variance that is almost identical to the noise root variance. In other words this interferer has a mean interference to noise ratio approximately equal to one. In this example both noise and interference per bin were modelled as zero mean independent, identically distributed (iid) complex Gaussian processes having the indicated root variance. It can be seen in this example that the sample vector amplitude in some “interference” bins is quite low even though the mean interference to noise variance ratio is quite large, for example bins 16, 34, and 35. Conversely, the sample vector amplitude in some “noise-only” bins is well above the mean, as is to be expected given the Rayleigh amplitude distribution.
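The frequency domain model behind FIG. 2 can be sketched in a few lines of Python. This is a minimal illustration only, assuming iid zero mean complex Gaussian noise and interference in each bin as described above. The function name `simulate_spectrum`, the band edges and the root variances are hypothetical; only the bin ranges 38 to 44 and 64 to 77 are taken from the text, and the values used to produce FIG. 2 are not reproduced here.

```python
import random

def simulate_spectrum(n_bins, noise_sigma, interferers, seed=0):
    """Return (X, truth): complex frequency-domain samples and per-bin labels.

    interferers is a list of (first_bin, last_bin, sigma) tuples (hypothetical
    values); sigma is the root variance added to each real/imaginary component
    in those bins.
    """
    rng = random.Random(seed)
    # Per-component variance in each bin: noise plus every interferer whose
    # band covers the bin (overlapping interferers simply add).
    var = [noise_sigma ** 2] * n_bins
    truth = [0] * n_bins
    for first, last, sigma in interferers:
        for n in range(first, last + 1):
            var[n] += sigma ** 2
            truth[n] = 1
    X = [complex(rng.gauss(0.0, v ** 0.5), rng.gauss(0.0, v ** 0.5)) for v in var]
    return X, truth

# Three illustrative interferers; the last has interference to noise ratio of one.
X, truth = simulate_spectrum(100, 1.0, [(10, 30, 3.0), (38, 44, 4.0), (64, 77, 1.0)])
```

Because each bin is an independent draw, individual interference bins can fall below the noise root variance and individual noise bins can exceed it, which is the behaviour noted for FIG. 2.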

Received signal statistics are developed for two fading channel scenarios. In the Rayleigh fading channel scenario, elements of the channel transfer function matrix for the ith interferer H_(i) are correlated complex Gaussian random variables. The PDF of U_(i,n) ^(R)=Re{H_(i,nn)S_(i,n)} the real part of an individual, noise-free frequency bin in which interferer i is present, is

$\begin{matrix} {{{p_{U}\left( U_{i,n}^{R} \right)} = {\frac{1}{2\; \Omega_{i}}{\exp \left( {- \frac{U_{i,n}^{R}}{\Omega_{i}}} \right)}}},} & 9 \end{matrix}$

where H_(i,nn) is the (nn)th element of the channel transfer function matrix for the ith interferer H_(i), and S_(i), is the nth element of S_(i), Ω

σ_(i)σ_(H) _(i) for

$\sigma_{H_{i}}^{2} = {\frac{1}{2}{\langle{H_{i}}^{2}\rangle}}$

the power gain of the ith multipath channel. The PDF is the same for the imaginary part of an individual, noise-free frequency bin.

From equation 9, the PDF of the real part of an individual frequency bin in which one or more interferers is present may be derived by first deriving the first characteristic function of X_(n) ^(R)

$\begin{matrix} {{\Phi_{X}(\theta)} = {\prod\limits_{i \in M}\; {\frac{1}{1 + {\Omega_{i}^{2}\theta^{2}}}{\exp \left( {{- \frac{1}{2}}\sigma_{Z}^{2}\theta^{2}} \right)}}}} & 10 \end{matrix}$

Therefore, it follows that the PDF of the real part of a frequency bin of the received sample X_(n) ^(R) is given by

$\begin{matrix} {{p_{X\; 1}\left( X_{n}^{R} \right)} = {\sum\limits_{i \in M}{{\exp \left( \frac{\sigma_{Z}^{2}}{2\; \Omega_{i}^{2}} \right)}{\frac{\Omega_{i}}{4{\prod\limits_{\underset{j \neq i}{j \in M}}\; \left( {\Omega_{i}^{2} - \Omega_{j}^{2}} \right)}}\begin{bmatrix} {{{\exp \left( \frac{X_{n}^{R}}{\Omega_{i}} \right)}{{erfc}\left( {\frac{\sigma_{Z}}{\sqrt{2}\Omega_{i}} + \frac{X_{n}^{R}}{\sqrt{2}\sigma_{Z}}} \right)}} +} \\ {{\exp \left( {- \frac{X_{n}^{R}}{\Omega_{i}}} \right)}{{erfc}\left( {\frac{\sigma_{Z}}{\sqrt{2}\Omega_{i}} - \frac{X_{n}^{R}}{\sqrt{2}\sigma_{Z}}} \right)}} \end{bmatrix}}}}} & 11 \end{matrix}$

noting that this also is the PDF of the imaginary part of a frequency bin of the received sample,

$X_{n}^{I}\overset{\Delta}{=}{{Im}{\left\{ {{\sum\limits_{i \in M}{H_{i,{nn}}S_{i,n}}} + Z_{n}} \right\}.}}$

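The joint PDF

$\begin{matrix} {{p_{X\; 1}^{J}\left( X_{n} \right)}\overset{\Delta}{=}{{p_{X\; 1}\left( {X_{n}^{R},X_{n}^{I}} \right)} = {{p_{X\; 1}\left( X_{n}^{R} \right)}{p_{X\; 1}\left( X_{n}^{I} \right)}}}} & 12 \end{matrix}$

readily can be evaluated through substitution of equation 11, but the magnitude PDF

$\begin{matrix} {{p_{X\; 1}^{M}\left( \left| X_{n} \right| \right)} = {\left| X_{n} \right|{\int_{- \pi}^{\pi}{{p_{X\; 1}\left( {\left| X_{n} \right|\cos \varphi} \right)}{p_{X\; 1}\left( {\left| X_{n} \right|\sin \varphi} \right)}d\varphi}}}} & 13 \end{matrix}$

requires numerical integration.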

In the Gaussian channel scenario, either line-of-sight conditions exist or otherwise multipath fading is negligible. In this case the channel transfer matrix for the ith interferer is the interferer variance times the identity matrix, for all interferers H_(i)=σ_(i)I,∀i. The PDF of the real part of the received sample X_(n) ^(R) reduces to

$\begin{matrix} \begin{matrix} {{p_{X\; 2}\left( X_{n}^{R} \right)} = {p_{X\; 2}\left( X_{n}^{I} \right)}} \\ {= {\frac{1}{\sqrt{2\; {\pi\left( {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}} \right)}}}{\exp\left( {- \frac{\left( X_{n}^{R} \right)^{2}}{2\left( {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}} \right)}} \right)}}} \end{matrix} & 14 \end{matrix}$

In this case, the joint PDF directly can be written as

$\begin{matrix} {{p_{X\; 2}^{J}\left( X_{n} \right)}\overset{\Delta}{=}{{p_{X\; 2}\left( {X_{n}^{R},X_{n}^{I}} \right)} = {\frac{1}{2\; {\pi\left( {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}} \right)}}{\exp\left( {- \frac{{X_{n}}^{2}}{2\left( {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}} \right)}} \right)}}}} & 15 \end{matrix}$

and the magnitude PDF is

$\begin{matrix} {{p_{X\; 2}^{M}\left( {X_{n}} \right)} = {\frac{X_{n}}{\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}}{\exp\left( {- \frac{{X_{n}}^{2}}{2\left( {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}} \right)}} \right)}}} & 16 \end{matrix}$

Moments and cumulants of the received sample X also are of interest.

For the Rayleigh fading case, from the first characteristic function of the real part of the received sample in the nth bin X_(n) ^(R), equation 10, the second characteristic function is

$\begin{matrix} {{\Psi_{X}(\theta)} = {{- {\sum\limits_{i \in M}{\log \left( {1 + {\Omega_{i}^{2}\theta^{2}}} \right)}}} - {\frac{1}{2}\sigma_{Z}^{2}\theta^{2}}}} & 17 \end{matrix}$

By exploiting the moment generating property of the first characteristic function, equation 10, and the cumulant generating property of the second characteristic function, equation 17, the first two even order moments (μ₂, μ₄) and cumulants (κ₂, κ₄) of X_(n) ^(R) are

$\begin{matrix} {{{{\mu_{2}\left( X_{n}^{R} \right)}\overset{\Delta}{=}{{\langle\left( X_{n}^{R} \right)^{2}\rangle} = {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}}}}{{\kappa_{2}\left( X_{n}^{R} \right)} = {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}}}{\mu_{4}\left( X_{n}^{R} \right)}\overset{\Delta}{=}{{\langle\left( X_{n}^{R} \right)^{4}\rangle} = {{3\sigma_{Z}^{4}} + {12\; \sigma_{Z}^{2}{\sum\limits_{i \in M}\Omega_{i}^{2}}} + {24{\sum\limits_{i \in M}\Omega_{i}^{4}}}}}}{{\kappa_{4}\left( X_{n}^{R} \right)} = {12{\sum\limits_{i \in M}{\Omega_{i}^{4}.}}}}} & 18 \end{matrix}$

The odd order moments and cumulants of X_(n) ^(R) are identically zero. The kurtosis of X_(n) ^(R) is

$\begin{matrix} {{\gamma_{2}\left( X_{n}^{R} \right)}\overset{\Delta}{=}{{{\frac{\mu_{4}}{\mu_{2}^{2}} - 3} \equiv \frac{\kappa_{4}}{\kappa_{2}^{2}}} = {12{\sum\limits_{i \in M}{\frac{\Omega_{i}^{4}}{2\left( {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}} \right)^{2}}.}}}}} & 19 \end{matrix}$

For the Gaussian channel case, the first two even order moments and cumulants of the real part of the received sample in the nth bin X_(n) ^(R) can be found to be

$\begin{matrix} {{{{\mu_{2}\left( X_{n}^{R} \right)}\overset{\Delta}{=}{{\langle\left( X_{n}^{R} \right)^{2}\rangle} = {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}}}}{{\kappa_{2}\left( X_{n}^{R} \right)} = {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}}}{\mu_{4}\left( X_{n}^{R} \right)}\overset{\Delta}{=}{{\langle\left( X_{n}^{R} \right)^{4}\rangle} = {3\left( \; {\sigma_{Z}^{2} + {\sum\limits_{i \in M}\Omega_{i}^{2}}} \right)^{2}}}}{{\kappa_{4}\left( X_{n}^{R} \right)} = 0.}} & 20 \end{matrix}$

As above, the odd order moments and cumulants of X_(n) ^(R) are identically zero. The kurtosis of X_(n) ^(R) in the Gaussian channel case is zero.

For the noise-only case, where X^(R)≡Z^(R), the first four moments and cumulants are identically zero except the second-order moment and cumulant, μ₂(Z^(R))=κ₂(Z^(R))=σ_(Z) ², and the fourth-order moment, μ₄(Z^(R))=3σ_(Z) ⁴. The kurtosis of Z^(R) also is zero.
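The contrast between zero kurtosis for Gaussian data and non-zero kurtosis for the Rayleigh fading case can be checked numerically. The following is a minimal sketch, assuming zero mean real samples and the standard moment estimate of excess kurtosis; the function name `excess_kurtosis` and the sample sizes are illustrative choices. A unit-scale Laplacian, the form of equation 9, is drawn as the difference of two exponential variates and has a theoretical excess kurtosis of 3.

```python
import random

def excess_kurtosis(samples):
    """Moment estimate N*sum(x^4)/(sum(x^2))^2 - 3 for zero-mean data."""
    n = len(samples)
    s2 = sum(x * x for x in samples)
    s4 = sum(x ** 4 for x in samples)
    return n * s4 / (s2 * s2) - 3.0

rng = random.Random(1)
# Gaussian samples: noise-only bins or the Gaussian channel scenario.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
# Laplacian samples: real part of a noise-free bin under Rayleigh fading.
laplacian = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(20000)]

k_gauss = excess_kurtosis(gaussian)      # close to zero
k_laplace = excess_kurtosis(laplacian)   # clearly positive
```

The estimate for the Gaussian data fluctuates around zero, while the Laplacian data give a clearly positive value, which is why kurtosis can serve as a distance from Gaussianity.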

Common to many detection methods is estimation of the noise variance of the received sample X. This is essentially an initial classification of X into noise and interference, and may be achieved as follows.

First sort X by order of increasing magnitude to produce $\overset{\rightarrow}{X}$.

Then find $\hat{\overset{\rightarrow}{X}}_{Z}\overset{\Delta}{=}\left\lbrack {\overset{\rightarrow}{X}}_{1}\mspace{14mu} \ldots \mspace{14mu} {\overset{\rightarrow}{X}}_{N_{Z}} \right\rbrack$, the sub-vector of $\overset{\rightarrow}{X}$ which minimizes distance from Gaussianity, such that

$\begin{matrix} {{\hat{N}}_{Z} = {\arg \; {\min\limits_{N_{Z}}{D\left( {\overset{\rightarrow}{X}}_{N_{Z}} \right)}}}} & 21 \end{matrix}$

where D(•) is some measure of Gaussianity, such as kurtosis or the Kolmogorov-Smirnov (K-S) distance. From equation 19, a moment method estimate of kurtosis which serves the purpose of D(•) is

$\begin{matrix} {D_{\gamma_{2}}\overset{\Delta}{=}{{\hat{\gamma}}_{2} = {\frac{2\; N_{Z}{\sum\limits_{n = 1}^{N_{Z}}{{Re}\left\{ X_{n} \right\}^{4}}}}{\left\lbrack {\sum\limits_{n = 1}^{N_{Z}}{{Re}\left\{ X_{n} \right\}^{2}}} \right\rbrack^{2}} + \frac{2\; N_{Z}{\sum\limits_{n = 1}^{N_{Z}}{{Im}\left\{ X_{n} \right\}^{4}}}}{\left\lbrack {\sum\limits_{n = 1}^{N_{Z}}{{Im}\left\{ X_{n} \right\}^{2}}} \right\rbrack^{2}} - 3.}}} & {22\; a} \end{matrix}$

The K-S distance may be evaluated as

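$\begin{matrix} {D_{K - S}\overset{\Delta}{=}\max\left\{ \left| {{p_{Z}\left( {{\overset{\rightarrow}{X}}_{\mathbf{N}_{Z}} < x} \right)} - {\mathbf{N}_{Z}/N_{Z}}} \right| \right\},} & {22b} \end{matrix}$

where p_(Z)(X<x) is the cumulative distribution function (CDF) for dummy variable x and $\mathbf{N}_{Z}\overset{\Delta}{=}\left\lbrack {1\mspace{14mu} \ldots \mspace{14mu} N_{Z}} \right\rbrack$.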

In general, the K-S distance is more robust than kurtosis as it involves all moments. Kurtosis involves the fourth cumulant (or second and fourth moments) only. In calculating the K-S distance, it is equally valid to evaluate the distance of the real and imaginary parts of X from Gaussian, the distance of |X| from Rayleigh, or the distance of |X|² from exponential. Truncated cumulative distribution functions (CDFs) are required to evaluate the K-S test. For data truncated at $T_{X} = \left| {\overset{\rightarrow}{X}}_{N_{Z} + 1} \right|$, the truncated Rayleigh CDF is

$\begin{matrix} {p_{Z}\left( {{X} < {x{{{X\left. {< T_{X}} \right)} = {\frac{\left. {1 - {\exp \left( {{{- x^{2}}/2}\sigma_{Z}^{2}} \right)}} \right)}{\left. {1 - {\exp \left( {{{- T_{X}^{2}}/2}\sigma_{Z}^{2}} \right)}} \right)}.}}}}} \right.} & 23 \end{matrix}$

Next the noise variance is estimated. To produce the desired result, and in order to evaluate the distance measure D(•), the noise variance is evaluated using a maximum likelihood estimate over truncated data. From equation 23, the variance estimate based on the Rayleigh distribution truncated at T_(X) requires numerical solution of

$\begin{matrix} {{{\hat{\sigma}}_{Z}^{2} - {\frac{T_{X}^{2}}{2}\left\lbrack \frac{\exp \left( {{- T_{X}^{2}}/\left( {2{\hat{\sigma}}_{Z}^{2}} \right)} \right)}{1 - {\exp \left( {{- T_{X}^{2}}/\left( {2{\hat{\sigma}}_{Z}^{2}} \right)} \right)}} \right\rbrack}} = {\frac{1}{2\; N_{Z}}{\sum\limits_{n = 1}^{N_{Z}}{\left| {\overset{\rightarrow}{X}}_{n} \right|^{2}.}}}} & 24 \end{matrix}$
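The sorting, truncated maximum likelihood and K-S steps above can be sketched as follows. This is an illustrative sketch rather than a definitive implementation. It assumes the Rayleigh (magnitude) form of the K-S distance, and it solves the stationarity condition of the truncated Rayleigh likelihood of equation 23, in which the correction term carries the factor T_X²/2, by bisection. The function names, the `min_count` lower limit on N_Z and the test data are hypothetical.

```python
import math
import random

def solve_truncated_sigma2(mean_half_power, t):
    """Solve s - (t^2/2)/expm1(t^2/(2s)) = mean_half_power for s by bisection.

    Returns None when no sign change is found, i.e. the data below t look too
    flat for a finite variance estimate.
    """
    def g(s):
        a = t * t / (2.0 * s)
        corr = 0.0 if a > 700.0 else (t * t / 2.0) / math.expm1(a)
        return s - corr - mean_half_power

    lo = hi = mean_half_power        # g(lo) <= 0 because the correction is >= 0
    for _ in range(40):              # grow the bracket until g changes sign
        hi *= 2.0
        if g(hi) > 0.0:
            break
    else:
        return None
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def ks_distance(sorted_mags, sigma2, t):
    """Largest gap between the empirical CDF and the truncated Rayleigh CDF."""
    norm = -math.expm1(-t * t / (2.0 * sigma2))
    n = len(sorted_mags)
    return max(
        abs(-math.expm1(-x * x / (2.0 * sigma2)) / norm - (k + 1) / n)
        for k, x in enumerate(sorted_mags)
    )

def estimate_noise_variance(X, min_count=16):
    """Return (N_Z, sigma2) for the sub-vector closest to truncated Rayleigh."""
    mags = sorted(abs(x) for x in X)
    best = None
    for n_z in range(min_count, len(mags)):
        t = mags[n_z]                # truncation point |X_{N_Z+1}| (0-based index)
        sub = mags[:n_z]
        m = sum(x * x for x in sub) / (2.0 * n_z)
        if t <= 0.0 or m <= 0.0:
            continue
        s2 = solve_truncated_sigma2(m, t)
        if s2 is None:
            continue
        d = ks_distance(sub, s2, t)
        if best is None or d < best[0]:
            best = (d, n_z, s2)
    if best is None:
        raise ValueError("no candidate sub-vector gave a finite variance estimate")
    return best[1], best[2]

# Hypothetical data: 200 noise-only bins and 56 bins of strong interference.
rng = random.Random(7)
X = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
X += [complex(rng.gauss(0, 10), rng.gauss(0, 10)) for _ in range(56)]
n_z, sigma2 = estimate_noise_variance(X)
```

The search is quadratic in N, which is acceptable for short sample vectors. When N_Z is small the truncation point sits low in the Rayleigh distribution and the variance estimate becomes poorly conditioned, which is the reason for the assumed lower limit `min_count`.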

For each frequency bin the receiver must evaluate which of two hypotheses is more likely: the bin contains noise only, or the bin contains noise and interference. Mathematically the hypotheses are expressed as

$\begin{matrix} \begin{matrix} {{\mathcal{H}_{0}\text{:}\mspace{14mu} X_{n}} = Z_{n}} & {n \in \left\lbrack {1\mspace{14mu} \ldots \mspace{14mu} N} \right\rbrack} \\ {{\mathcal{H}_{1}\text{:}\mspace{14mu} X_{n}} = {{\sum\limits_{i \in M}{H_{i,{nn}}S_{i,n}}} + Z_{n}}} & {n \in {\left\lbrack {1\mspace{14mu} \ldots \mspace{14mu} N} \right\rbrack.}} \end{matrix} & 25 \end{matrix}$

In this equation M⊂[1 . . . I] is the subset of interferers present in the nth frequency bin.

Complicating this problem is that there are an unknown number of different signal models within the received sample X, that is, any given frequency sample X_(n) can comprise any number of interferers between 0 and I. Thus, over the entire signal spectrum vector X, the hypotheses are that the bin contains noise only, and that the bin contains noise and a set of interferers for all different sets of interferers. Mathematically, the hypotheses are expressed as:

$\begin{matrix} \begin{matrix} {{\mathcal{H}_{0}\text{:}\mspace{14mu} X} = Z} & {n \in \left\lbrack {1\mspace{14mu} \ldots \mspace{14mu} N} \right\rbrack} \\ {{\mathcal{H}_{1}\text{:}\mspace{14mu} X} = \left\{ \begin{matrix} {{\sum\limits_{i \in M_{0}}{H_{i,{1_{0}1_{0}}}S_{i,1_{0}}}} + Z_{1_{0}}} & {n \in 1_{0}} \\ \vdots & \; \\ {{\sum\limits_{i \in M_{K}}{H_{i,{1_{K}1_{K}}}S_{i,1_{K}}}} + Z_{1_{K}}} & {n \in 1_{K}} \end{matrix} \right.} & \; \end{matrix} & 26 \end{matrix}$

In the above equation M₁∪M₂∪ . . . ∪M_(K)=[1 . . . I] index K sets of interferers, l_(l)⊂[0 . . . N−1] is a vector of indices of adjacent frequency bins containing an identical number of interferers, noting that [l₁∪l₂∪ . . . ∪l_(K)]≡[k₁∪k₂∪ . . . ∪k_(I)], and H_(i,l_(l)l_(l)) denotes the square diagonal sub-matrix of H_(i) with rows and columns selected by elements of l_(l). l₀ is the vector of indices of all frequency bins, not necessarily adjacent, containing noise only, and M₀ is the empty set of interferers.

Hypothesis probabilities may be compared using the generalized likelihood ratio test (GLRT)

$\begin{matrix} \begin{matrix} {{L_{G}(X)} = {\frac{p_{X}\begin{pmatrix} {{X{{\hat{M}}_{1}\mspace{14mu} \ldots \mspace{14mu} {\hat{M}}_{K}}},{\hat{1_{1}}\mspace{14mu} \ldots \mspace{14mu} {\hat{1}}_{K}},{\hat{\sigma}}_{Z}^{2},} \\ {{{\hat{C}}_{{HS}_{1}{HS}_{1}}\mspace{14mu} \ldots \mspace{14mu} {\hat{C}}_{{HS}_{1}{HS}_{1}}},\mathcal{H}_{1}} \end{pmatrix}}{p_{Z}\left( {{X{\hat{\sigma}}_{Z}^{2}},\mathcal{H}_{0}} \right)} > \Gamma_{1}}} \\ {= {\max\limits_{{M_{1}{\ldots M}_{K}},{1_{1}{\ldots 1}_{K}}}\frac{p_{X}\left( {{X{\hat{\sigma}}_{Z}^{2}},{{\hat{C}}_{{HS}_{1}{HS}_{1}}\mspace{14mu} \ldots \mspace{14mu} {\hat{C}}_{{HS}_{1}{HS}_{1}}},\mathcal{H}_{1}} \right)}{p_{Z}\left( {{X{\hat{\sigma}}_{Z}^{2}},\mathcal{H}_{0}} \right)}}} \end{matrix} & 27 \end{matrix}$

where C_(HS) _(i) _(HS) _(i) is the covariance of H_(i)S_(i), p_(X)(X| . . . ) is the multivariate probability density function (PDF) of X and p_(Z)(X| . . . ) is the noise PDF. One or more interferers are detected when L_(G)(X) exceeds threshold Γ₁. Evaluation of the GLRT numerator requires estimating how many interferers are present, which interferer is present in each frequency bin and the covariance matrix for each interferer. Thus, complexity of equation 27 increases exponentially with I.

Provided interference data is white (interference property 4, above) and the duration of x is less than the channel coherence time (interference property 1, above), it can be shown that C_(HS) _(i) _(HS) _(i) is diagonal for s_(i) produced through any linear modulation, irrespective of any channel conditions additional to the stated assumptions.

For C_(HS) _(i) _(HS) _(i) diagonal,

${p_{X}\left( {X\mspace{14mu} \ldots}\mspace{14mu} \right)} = {\prod\limits_{n = 1}^{N}\; {p_{X}\left( {X_{n}\mspace{14mu} \ldots}\mspace{14mu} \right)}}$

is simply the likelihood function of the univariate PDF, with dependency on variance not covariance. The GLRT simplifies to

$\begin{matrix} \begin{matrix} {{L_{G}(X)} = {\frac{\prod\limits_{n = 1}^{N}\; {p_{X}\begin{pmatrix} {{X_{n}{{\hat{M}}_{1}\mspace{14mu} \ldots \mspace{14mu} {\hat{M}}_{K}}},{\hat{1_{1}}\mspace{14mu} \ldots \mspace{14mu} {\hat{1}}_{K}},} \\ {{\hat{\sigma}}_{Z}^{2},{{\hat{\Omega}}_{1}^{2}\mspace{14mu} \ldots \mspace{14mu} {\hat{\Omega}}_{l}^{2}},\mathcal{H}_{1}} \end{pmatrix}}}{\prod\limits_{n = 1}^{N}{p_{Z}\left( {{X_{n}{\hat{\sigma}}_{Z}^{2}},\mathcal{H}_{0}} \right)}} > \Gamma_{1}}} \\ {= {\max\limits_{{M_{1}{\ldots M}_{K}},{{\hat{1}}_{1}\ldots {\hat{1}}_{K}}}{\frac{\prod\limits_{n = 1}^{N}{p_{X}\left( {{X{\hat{\sigma}}_{Z}^{2}},{{\hat{\Omega}}_{1}^{2}\mspace{14mu} \ldots \mspace{14mu} {\hat{\Omega}}_{l}^{2}},\mathcal{H}_{1}}\mspace{14mu} \right)}}{\prod\limits_{n = 1}^{N}{p_{Z}\left( {{X_{n}{\hat{\sigma}}_{Z}^{2}},\mathcal{H}_{0}} \right)}}.}}} \end{matrix} & 28 \end{matrix}$

Effectively, X now is partitioned into noise-only samples which have estimated variance ${\hat{\sigma}}_{Z}^{2} < \Gamma_{1}^{\prime}$, for some threshold Γ₁′, and interference plus noise samples which have estimated variance

${{\hat{\sigma}}_{Z}^{2} + {\sum\limits_{i \in M_{k}}\Omega_{i}^{2}}} > {\Gamma_{1}^{\prime}.}$

Pragmatic selection of Γ₁′ can be made through Monte Carlo methods. For example, the balance between the probability of false positive detection (type I error) and the probability of false negative detection (type II error) can be found empirically as a function of Γ₁′ at some interference to noise ratio. For cognitive radio applications, where avoidance of interference may have a higher priority than channel capacity, type I errors may be more acceptable than type II errors and Γ₁′ can be set accordingly.

The performance of the generalized likelihood ratio test in classifying the reference sample vector spectrum shown in FIG. 2 is shown in FIG. 3. The dashed line in FIG. 3 shows the likelihood ratio in each frequency bin, and the solid line shows the resulting classification such that a classification of zero indicates noise and a non-zero classification indicates signal. The threshold Γ₁′ was set to one in this example. It can be seen, by comparison of FIGS. 2 and 3, that the GLRT produced a number of type I and type II errors per bin for the reference sample vector.
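A per-bin likelihood ratio classification of this kind can be sketched as below. This is a simplified sketch, not the GLRT of equation 28: it assumes the Gaussian channel scenario, so that the per-bin PDFs take the forms of equations 4 and 15, and it assumes that the noise variance and a single interference variance are already known rather than estimated per set of interferers. The function names and example values are hypothetical.

```python
import math

def log_likelihood_ratio(x, noise_var, interf_var):
    """Log of p(x | noise + interference) / p(x | noise only) for one bin."""
    v0 = noise_var                   # per-component variance under H0
    v1 = noise_var + interf_var      # per-component variance under H1
    power = abs(x) ** 2
    return math.log(v0 / v1) + 0.5 * power * (1.0 / v0 - 1.0 / v1)

def classify_bins(X, noise_var, interf_var, threshold=1.0):
    """Return 1 for bins whose likelihood ratio exceeds the threshold, else 0."""
    log_gamma = math.log(threshold)
    return [
        1 if log_likelihood_ratio(x, noise_var, interf_var) > log_gamma else 0
        for x in X
    ]

labels = classify_bins([0j, 2 + 0j, 2.1 + 0j, 10 + 0j], noise_var=1.0, interf_var=4.0)
```

Each bin is classified on its own magnitude only, so a weak bin inside an interferer's band is labelled as noise and a strong noise-only bin is labelled as interference. This is the source of the per-bin type I and type II errors noted for FIG. 3.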

The generalized likelihood ratio test described by equation 28, without constraints on the number of interferers or the number of sets of interferers, is unable to exploit the assumption that the ith interferer occupies L_(i) contiguous frequency bins X_(n) . . . X_(n+L_(i)−1). This additional information may be exploited in a formal context by modelling X as an observation sequence vector produced by a hidden Markov process.

Using hidden Markov modelling the state of the nth frequency bin is denoted as

$\begin{matrix} {S_{n} \in \left\lbrack {\mathcal{S}_{0}\mspace{14mu} \ldots \mspace{14mu} \mathcal{S}_{K}} \right\rbrack} & 29 \end{matrix}$

The state $\mathcal{S}_{k}$ specifies the set of interferers M_(k) plus noise and state $\mathcal{S}_{0}$ contains only noise, recalling that M₀ is the empty set of interferers. Also implicit is that an interferer belonging to a set of interferers iεM_(k) has variance Ω_(i) ², thus state $\mathcal{S}_{k}$ implies variance

${\hat{\sigma}}_{Z}^{2} + {\sum\limits_{i \in M_{k}}{\Omega_{i}^{2}.}}$

The observation probability for X_(n) in state $\mathcal{S}_{k}$ is

$\begin{matrix} {{{P_{k}\left( X_{n} \right)}\overset{\Delta}{=}{{p\left( {X_{n} \mid S_{n} = \mathcal{S}_{k}} \right)} = {p_{X}\left( {X_{n} \mid M = M_{k}} \right)}}},} & 30 \end{matrix}$

where p_(X)(X) is given in equation 11. A set of N observation probabilities is denoted $P\overset{\Delta}{=}\left\lbrack {P_{k}\left( X_{n} \right)} \right\rbrack$ for n=1 . . . N and the initial state distribution is denoted $I\overset{\Delta}{=}\left\lbrack {p\left( {S_{1} = \mathcal{S}_{k}} \right)} \right\rbrack$ for k=1 . . . K. An essential feature of this model is the transition probability between states

$\begin{matrix} {{T_{lk}\overset{\Delta}{=}{p\left( {S_{n + 1} = \mathcal{S}_{k} \mid S_{n} = \mathcal{S}_{l}} \right)}},} & 31 \end{matrix}$

The posterior transition probabilities are

$T_{lk} = \left\{ \begin{matrix} {{\left( {R^{l_{l}} - 1} \right)/R^{l_{l}}},} & {k = l} \\ {{1/R^{l_{l}}},} & {k = m} \\ {0,} & {{k \neq l},m,} \end{matrix} \right.$

where $R^{l_{l}}$ denotes the vector space, or number of elements, in l_(l), introduced in equation 26, and m is defined by the posterior state assignments $S_{n - R^{l_{l}} + 1} = S_{n - R^{l_{l}} + 2} = S_{n} = \mathcal{S}_{l}$, $S_{n + 1} = \mathcal{S}_{m}$ for the state sequence $\left\lbrack {\ldots \; S_{n - R^{l_{l}} + 1},S_{n - R^{l_{l}} + 2},S_{n},S_{n + 1}\; \ldots} \right\rbrack$. A set of N transition probabilities is denoted $T\overset{\Delta}{=}\left\lbrack T_{lk} \right\rbrack$ for l,k=0 . . . K. The output of defining the hidden Markov model for sample vector X is a state sequence $S\overset{\Delta}{=}S_{1},S_{2},\ldots,S_{N}$ describing whether each frequency domain sample X_(n) is best classified as noise, $S_{n} = \mathcal{S}_{0}$, or interference, $S_{n} \in \left\lbrack {\mathcal{S}_{1}\mspace{14mu} \ldots \mspace{14mu} \mathcal{S}_{K}} \right\rbrack$. This is achieved by finding the set of model parameters [T,P,I] which maximise p(X|T,P,I).

The following sequential process describes one possible implementation for how the HMM described above can be used to estimate state sequence S.

First the noise variance is estimated. This is achieved by applying the method described previously.

Next a nominal interference variance is produced. To be able to evaluate P_(k)(X_(n)), it is necessary to estimate interference variance. However, at this point in the process, it is not known how many interferers, if any, are present. Explicit estimation of variance for individual interferers is problematic. Further, estimation of variance for composite interference is of little value as this will tend to produce an estimate dominated by any large interferers, leading to undesirable type II (false negative detection) errors for low power interferers. One means of reducing this problem is to produce a nominal interference variance by defining an acceptable type I (false positive detection) error threshold, that is, in the absence of any explicit interference estimation. For example, a nominal interference variance

${\hat{\sigma}}_{I}^{2} \triangleq r\,{\hat{\sigma}}_{Z}^{2}$  33

will produce a type I error for any noise sample for which

$\left| X_{n} \right|^{2} > {2\; {\hat{\sigma}}_{Z}^{2}\frac{r}{r - 1}\log \; {r.}}$

Now an initial classification is performed by estimating the hidden Markov model. To initialise the state sequence S, the two state set $[\mathcal{S}_{0}, \mathcal{S}_{1}]$ is considered, where $\mathcal{S}_{0}$ is the noise state and $\mathcal{S}_{1}$ is the interference state. The observation probabilities P are evaluated using

$P_{0}(X_{n}) = p_{Z}(X_{n})$  34

for some noise PDF $p_{Z}(\cdot)$, such as one of equations 3, 4 or 5, and

$P_{1}(X_{n}) = \left. p_{X}(X_{n}) \right|_{M = [1],\,\sigma_{1} = {\hat{\sigma}}_{1}},$  35

for some interference-plus-noise PDF $p_{X}(\cdot)$, such as one of equations 11, 12, 13, 14, 15 or 16. Equation 15 can be used to produce $P_{1}$ for the Gaussian channel model. The set of transition probabilities T is initialised using some nominal $\tau_{0}$, such that $T_{01} = T_{10} = 1/\tau_{0}$ and $T_{00} = T_{11} = (\tau_{0} - 1)/\tau_{0}$, ∀n=1 . . . N. The initial value for the state sequence S is

$\hat{S} = \arg\max\limits_{S}\; p\left( X \mid S,T,P,I \right)\, p\left( S \mid T,P,I \right)$  36

which may be produced efficiently by applying the Viterbi algorithm [28] to X with P and T as above and I=[P₀(X₁),P₁(X₁)]. Complexity is set by the Viterbi algorithm window/traceback length.
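The initial two-state classification can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of the invention: it assumes circular complex Gaussian observation densities with per-component variance equal to the noise variance in the noise state and r times the noise variance in the interference state, it works in the log domain, and it traces back over the whole vector instead of a finite window. The function name and argument layout are hypothetical.

```python
import math


def viterbi_two_state(power, noise_var, r, tau0):
    """Classify each bin as 0 (noise) or 1 (interference).

    power is the list of squared moduli |X_n|^2 of the frequency bins.
    """
    variances = (noise_var, r * noise_var)

    def log_pdf(p, var):
        # Log density of a circular complex Gaussian sample with
        # per-component variance var, written in terms of |X|^2.
        return -math.log(2.0 * math.pi * var) - p / (2.0 * var)

    log_stay = math.log((tau0 - 1.0) / tau0)   # T00 = T11
    log_switch = math.log(1.0 / tau0)          # T01 = T10

    # Initial distribution taken as [P0(X1), P1(X1)], as in the text.
    score = [2.0 * log_pdf(power[0], v) for v in variances]
    back = []
    for p in power[1:]:
        new_score, pointers = [], []
        for k in (0, 1):
            cands = [score[l] + (log_stay if l == k else log_switch)
                     for l in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + log_pdf(p, variances[k]))
        score = new_score
        back.append(pointers)

    # Trace back from the most likely final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With a noise variance of 1, r = 6.25 and τ₀ = 10, an isolated bin with |X_n|² = 6 exceeds the per-bin type I threshold yet is left in the noise state by this sketch, because two state changes cost more than the single observation gains. A contiguous run of strong bins is classified as interference.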

A final classification is now performed by estimating the hidden Markov model. A first estimate of the actual number of interferers now can be made by sorting the K sub-sequences of S, with length $R^{l_{k}}$, $k \in [1 \ldots K]$, for which $S_{n - R^{l_{k}} + 1} = S_{n - R^{l_{k}} + 2} = \ldots = S_{n} = \mathcal{S}_{1}$, by order of increasing variance. The K ordered sub-sequences now are reclassified as belonging to $\mathcal{S}_{1} \ldots \mathcal{S}_{K}$; the estimated variance for each interferer in state $\mathcal{S}_{k}$ is required to define $P_{k}(X_{n})$. Finally, the state sequence S then may be updated for the expanded state set $[\mathcal{S}_{0} \ldots \mathcal{S}_{K}]$, again by evaluating equation 36 using the Viterbi algorithm. Interference is detected if $S_{n} \neq \mathcal{S}_{0}$ for any $n \in [0 \ldots N - 1]$.
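The reclassification step can be sketched as follows: contiguous runs of the interference state are extracted from the initial two-state sequence, a variance is estimated for each run, and the runs are relabelled 1 . . . K in order of increasing variance. The sketch assumes that the variance of a run is estimated as the mean squared modulus divided by two (a per-component estimate that includes the noise contribution); that estimator and the function names are assumptions, not taken from the text.

```python
def interference_runs(states):
    """Return (start, end) pairs, end exclusive, for runs of non-zero state."""
    runs, start = [], None
    for n, s in enumerate(states):
        if s != 0 and start is None:
            start = n
        elif s == 0 and start is not None:
            runs.append((start, n))
            start = None
    if start is not None:
        runs.append((start, len(states)))
    return runs


def relabel_by_variance(states, power):
    """Relabel interference runs 1..K by increasing estimated variance.

    power is the list of squared moduli |X_n|^2 of the frequency bins.
    Returns the new state sequence and a dict mapping state to variance.
    """
    runs = interference_runs(states)
    # Assumed estimator: mean |X_n|^2 over the run, halved.
    est = [sum(power[a:b]) / (2.0 * (b - a)) for a, b in runs]
    order = sorted(range(len(runs)), key=lambda i: est[i])
    new_states = [0] * len(states)
    variances = {}
    for k, i in enumerate(order, start=1):
        a, b = runs[i]
        for n in range(a, b):
            new_states[n] = k
        variances[k] = est[i]
    return new_states, variances
```

The returned variances are the per-state values that a subsequent Viterbi pass over the expanded state set would need in order to evaluate the observation probabilities.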

The state sequence S may be updated by again applying equation 36 for the updated hidden Markov model parameters using, for example, the Viterbi algorithm. This process can be iterated to improve classification reliability. Empirically, one iteration was observed to give good results for the two state classification goal of classifying frequency bin samples as being either noise or interference plus noise.

FIG. 4 shows the hidden Markov model performance in classifying the reference sample vector spectrum shown in FIG. 2. The solid line in FIG. 4 shows the classification output of the second pass such that a classification of zero indicates noise and a non-zero classification indicates signal. The dashed line in FIG. 4 indicates the estimated mean interference amplitude resulting from the classification and the associated state sequence. It can be seen, by comparison of FIGS. 2 and 4, that the hidden Markov model performed well in classifying the higher power interferers, but failed to detect the low power interferer occupying bins 64 through to 77 of the reference sample vector spectrum.

To estimate the probability of type I and type II errors as a function of interference-to-noise ratio (INR), and in order to set threshold parameters for the various methods, repeated detection trials were performed for a single, standard interferer. The size of the received signal vector was set to N=100, and the interferer was bandlimited to 20% of the received signal bandwidth with a low-pass-equivalent carrier frequency of zero. Thus, from equation 1, I=1 and, from equation 6, L=20, a=41 and b=60. The interferer was generated as being IID complex Gaussian in each frequency bin in k₁. Results for each method as a function of INR were obtained by averaging the performance of 1000 detections.

For each method, a family of results for probabilities of type I and type II errors as functions of INR was produced for different values of the respective threshold parameter. The threshold parameter value for which the probabilities of type I and type II errors were most similar at an INR of 3 dB, given in Table I, then was used to produce the comparative results shown in FIGS. 5 a and 5 b.

TABLE I
Optimum threshold parameter for each classification method, being that value for which the probabilities of type I and type II errors were most similar for a standard interferer at INR of 3 dB.

Method                                   Optimal Threshold Parameter Value
Generalized log likelihood ratio test    Γ₁ = 0.6
Hidden Markov modelling                  r = 6.25, τ₀ = 10
Power Spectrum I                         Γ_(z) = 0.6
Power Spectrum II                        Γ_(z) = 0.6
MUSIC/Eigenanalysis                      Γ_(z) = 0.3

FIGS. 5 a and 5 b show detection performance, in particular the probability of classification error per frequency bin averaged over 1000 trials for an N=100 length sample vector. One standardized interferer occupying 20% of the receiver input bandwidth was used for all simulations. The legend in (b) applies to both FIGS. 5 a and 5 b. For each method, the observed probability of type I errors is shown in (a) and the observed probability of type II errors is shown in (b).

FIG. 5 a indicates that the methods producing the lowest

$\sum\limits_{n = 1}^{N}{p_{{Type}\; I}\left( {\mathcal{H}_{1} \mid X_{n}},\mathcal{H}_{0} \right)}$

at all INRs were HMM, Power Spectrum II and MUSIC. FIG. 5 b indicates that the methods producing the lowest

$\sum\limits_{n = 1}^{N}{p_{{Type}\; II}\left( {\mathcal{H}_{0} \mid X_{n}},\mathcal{H}_{1} \right)}$

were HMM and Power Spectrum II, with HMM performing best at low INR. MUSIC produced the most type II errors. HMM and Power Spectrum II produced the lowest average probability of classification error, with HMM being the best performing overall.

A more rigorous performance assessment of the detection methods was performed by simulating four wideband interferers with random and unknown parameters. Each interferer bandwidth was randomly chosen between 0% and 20% of the receive bandwidth, with a low-pass-equivalent carrier frequency uniformly randomly distributed between −0.8 and +0.8 of the sample frequency. The INR for each interferer was uniformly randomly distributed in decibels between −10 dB and +10 dB. Interferers were generated as being IID complex Gaussian in each frequency bin in k₁ for each i. Classification errors were recorded for each length 100 sample vector and averaged over 100,000 detections for each method.

FIGS. 6 a and 6 b show the low INR detection performance, in particular the distribution of classification errors for an N=100 length sample vector aggregated from 100,000 trials. Four interferers of random bandwidth, power and demodulated carrier frequency were used in each simulation, with INRs uniformly distributed over the range −10 dB to +10 dB. Each legend also indicates the ensemble probability of classification error. For each method, the observed distribution of type I errors is shown in (a) and the observed distribution of type II errors is shown in (b).

FIG. 6 a shows that HMM produced the fewest type I errors, while FIG. 6 b shows that HMM and Power Spectrum II produced the fewest type II errors. FIG. 6 b shows that all methods produced relatively high numbers of type II errors, reflecting the inherent difficulty in detecting interferers having INRs of less than 0 dB. Overall, these results confirm HMM to be the most robust method, although the performance improvement over Power Spectrum II is modest. The performance of all methods was found to improve with increasing INR.

The simulation results show that all of the methods trialled exhibited relatively high probabilities of producing classification errors averaged over the INR range. Most problematic for each method was the common case of a sample vector in which the instantaneous power spectral density (PSD) at either edge of an interferer passband was substantially lower than the noise PSD for that sample vector. For cognitive radio applications, where prevention of interference is paramount, inclusion of a “guard band” around each detected interferer would mitigate the risk of interference resulting from a type II error, at the expense of increasing the probability of a type I error. Increasing the length of the sample vector and using more than one sample vector to make the detection decision also would improve performance of all methods.
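A guard band of this kind can be sketched as a widening of the per-bin detection mask. The guard width in bins is a hypothetical parameter, as is the function name; the text does not specify how wide the guard band should be.

```python
def add_guard_band(mask, guard):
    """Extend each detected bin by `guard` bins on either side.

    mask is a list of booleans, True where interference was detected.
    """
    n = len(mask)
    out = list(mask)
    for i, flagged in enumerate(mask):
        if flagged:
            for j in range(max(0, i - guard), min(n, i + guard + 1)):
                out[j] = True
    return out
```

A wider guard band covers more missed edge bins (fewer type II errors) but marks more clean bins as occupied (more type I errors), which is the trade-off noted above.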

FIG. 7 shows a receiver system comprising a standard receiver front end and a processor arranged to detect interference in accordance with the invention. The signal received by the receiver is band limited to the approximate desired signal bandwidth B by band pass filter 1. The band limited signal is then converted to baseband and complex samples are obtained at a rate f_(s)≧B sufficient to prevent aliasing of the desired signal. The signal then passes to a processor arranged to detect interference. The processor can be switched in or out of the circuit so that it only attempts to detect interferers in a signal free period. This period may be an N sample inter-frame space. In the first step of the process the noise variance is estimated. Firstly the sample vector can be Fourier transformed using a discrete Fourier transform matrix. The samples can then be sorted by order of increasing magnitude to produce a magnitude vector. A sub-vector of the magnitude vector is then found that minimises the distance of either the real or imaginary parts of the vector from Gaussian, the distance of the modulus of the vector from Rayleigh, or the distance of the modulus squared from exponential. Once the sub-vector is found the noise variance is evaluated using a maximum likelihood estimate over the sub-vector. After the noise variance is estimated, a nominal interference variance is produced and then an initial classification is produced by estimating the hidden Markov model. The initial classification using the hidden Markov model can be performed using equations 34, 35 and 36. After the initial classification is performed a final classification is performed. This process can be iterated to improve reliability. Once interferers have been detected, the receiver can act to cancel the interferers.
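The noise variance estimation step can be illustrated with a simplified sketch. It assumes that the input is already in the frequency domain, it uses the Kolmogorov-Smirnov distance of the modulus squared from an exponential distribution as the noise measure, and it searches sub-vectors made up of the smallest-magnitude samples. The minimum sub-vector size, the function names and the treatment of the exponential mean (fitted from the sub-vector itself, with no correction for truncation) are assumptions made for illustration only.

```python
import math


def ks_distance_exponential(sorted_power):
    """KS distance between sorted samples and a fitted exponential CDF."""
    n = len(sorted_power)
    mean = sum(sorted_power) / n
    if mean == 0.0:
        return 1.0
    d = 0.0
    for i, y in enumerate(sorted_power):
        cdf = 1.0 - math.exp(-y / mean)
        # Compare with the empirical CDF just before and just after y.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d


def estimate_noise_variance(spectrum, min_size=None):
    """Return (noise variance estimate, sub-vector length N_Z).

    spectrum is a list of complex frequency-domain samples.
    """
    power = sorted(abs(x) ** 2 for x in spectrum)
    n = len(power)
    if min_size is None:
        min_size = max(2, n // 4)  # assumed lower bound on N_Z
    best_nz, best_d = min_size, float("inf")
    for nz in range(min_size, n + 1):
        d = ks_distance_exponential(power[:nz])
        if d < best_d:
            best_nz, best_d = nz, d
    # Maximum likelihood per-component variance over the sub-vector,
    # assuming circular complex Gaussian noise.
    noise_var = sum(power[:best_nz]) / (2.0 * best_nz)
    return noise_var, best_nz
```

The exhaustive search over sub-vector lengths costs on the order of N² operations, which is acceptable for the short sample vectors (N=100) used in the simulations.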

When the detection system is implemented in a transmitter, the transmitter can detect any interferers and then transmit at frequencies not occupied by interferers. Likewise, if the transmitter is not set up in the corresponding manner, the receiver can, upon detection of interferers, decode signals only from bins not occupied by interferers.

The foregoing describes the invention including preferred forms thereof. Alterations and modifications as will be obvious to those skilled in the art are intended to be incorporated in the scope of the invention as defined by the accompanying claims. 

1-12. (canceled)
 13. A method of estimating noise variance comprising the steps of receiving a sample vector of noise and interference, sorting the sample vector in the frequency domain by order of increasing magnitude to produce an ordered vector, finding a sub-vector of the ordered vector that minimises the distance from a noise measure, and estimating the noise variance.
 14. A method according to claim 13 wherein the sub-vector is found as $\hat{\vec{X}}_{Z} \triangleq [\vec{X}_{1} \ldots \vec{X}_{N_{Z}}]$ where $\hat{\vec{X}}_{Z}$ is the sub-vector and $\vec{X}_{1} \ldots \vec{X}_{N_{Z}}$ are elements of the sub-vector.
 15. A method according to claim 13 or claim 14 wherein the sub-vector is found as the vector that minimises the distance from Gaussianity, such that $\hat{N}_{Z} = \arg\min_{N_{Z}} D(\vec{X}_{N_{Z}})$ where D() is some measure of Gaussianity.
 16. A method according to claim 15 wherein the measure of Gaussianity is kurtosis.
 17. A method according to claim 15 wherein the measure of Gaussianity is the Kolmogorov-Smirnov distance.
 18. A method of detecting interference comprising the steps of receiving a sample vector X of noise and interference, finding a set of parameters that maximise p(X|T, P, I) where T is the transition probability between states, P is the observation probability and I is the initial state, determining a state sequence from the set of parameters and determining wideband interference from the state sequence.
 19. A method according to claim 18 wherein the step of finding a set of parameters that maximise p(X|T, P, I) includes the step of estimating noise variance.
 20. A method according to claim 19 wherein the noise variance is estimated by receiving a sample vector of noise and interference, sorting the sample vector in the frequency domain by order of increasing magnitude to produce an ordered vector, finding a sub-vector of the ordered vector that minimises the distance from a noise measure, and estimating the noise variance.
 21. A method according to claim 19 wherein the step of finding a set of parameters that maximise p(X|T, P, I) may further comprise the steps of producing a nominal interference variance and initialising a state sequence.
 22. A method according to claim 21 wherein the nominal interference variance may be found as ${\hat{\sigma}}_{I}^{2} \triangleq r\,{\hat{\sigma}}_{Z}^{2}$ where ${\hat{\sigma}}_{I}^{2}$ is the interference variance, r is a threshold value and ${\hat{\sigma}}_{Z}^{2}$ is the noise variance.
 23. A method according to claim 18 wherein the optimal state sequence is found as $\hat{S} = \arg\max\limits_{S}\; p\left( X \mid S,T,P,I \right)\, p\left( S \mid T,P,I \right).$
 24. A method according to claim 18 wherein the optimal state sequence is produced using the Viterbi algorithm.